- Newsgroups: comp.lang.c
- Path: news.uunet.ca!wildcan!sq!msb
- From: msb@sq.com (Mark Brader)
- Subject: Character set history (was: Recursion)
- Message-ID: <1996Apr13.230051.21741@sq.com>
- Organization: SoftQuad Inc., Toronto, Canada
- References: <31624BC2.70D2@sooner.net> <4kmm8f$ibd@sun001.spd.dsccc.com> <4kmsg0INNln0@keats.ugrad.cs.ubc.ca> <4knhci$67t@solutions.solon.com>
- Date: Sat, 13 Apr 1996 23:00:51 GMT
-
- Kazimir Kylheku <c2a192@ugrad.cs.ubc.ca> writes:
- > > Do you know any character sets in which the digit characters don't
- > > collate in sequence from '0' to '9'?
-
- Peter Seebach (seebs@solon.com) answers:
- > Doesn't matter; ANSI specifically requires that they do, and that they
- > are adjacent. Presumably for precisely this reason.
-
- "ANSI" here means the C standard, and indeed, I believe it's always
- been true for the character sets of machines where C has been implemented.
- If someone ever implements C on a machine whose native character set doesn't
- meet that requirement, they will have to do some sort of transliteration
- or emulation to make it look as though it does.
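
That guarantee is what makes the usual `c - '0'` idiom portable. A minimal sketch, relying only on the standard's promise that '0' through '9' are consecutive and ascending; the helper name digit_value is made up for illustration:

```c
/* Hypothetical helper: returns 0..9 for a decimal digit character,
   or -1 for anything else.  It depends only on the C standard's
   guarantee that '0'..'9' are consecutive and ascending in the
   execution character set, not on ASCII or EBCDIC specifically. */
static int digit_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```

So digit_value('8') is 8 on any conforming implementation, whatever the underlying codes happen to be.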
-
-
- To digress outside of the newsgroup's scope, though, there certainly
- have been character sets historically where the 10 characters
- "0123456789" did not occupy consecutive positions in that order.
-
- [1] The sequence " 1234567890" occurred in BCDIC (the code that EBCDIC
- was nominally an "Extended" version of), and in something called PTTC.
- These were 6-bit character sets used in punch-card days. Thus '8'-'2'
- would be 6, as in ASCII or EBCDIC, but '8'-'0' would be -2, not 8.
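
To make the arithmetic concrete, here is a rough sketch in C that models only the ordering quoted above. Positions are counted from the start of that quoted sequence, which is an assumption; the actual BCDIC codes are not reproduced here, and the names bcdic_order, bcdic_pos and bcdic_digit_value are invented for the example:

```c
#include <string.h>

/* The ordering as quoted in the text: blank, then 1..9, then 0.
   Only the relative order comes from the text; the absolute codes
   are NOT modelled, so positions here start at 0 by assumption. */
static const char bcdic_order[] = " 1234567890";

/* Hypothetical helper: position of c within that sequence, or -1. */
static int bcdic_pos(char c)
{
    const char *p;
    if (c == '\0')              /* strchr would match the terminator */
        return -1;
    p = strchr(bcdic_order, c);
    return p ? (int)(p - bcdic_order) : -1;
}

/* Digit value under this ordering: 1..9 sit at positions 1..9 and
   0 sits at position 10, so reduce modulo 10.  Returns -1 for the
   blank and for anything not in the sequence. */
static int bcdic_digit_value(char c)
{
    int pos = bcdic_pos(c);
    return (pos >= 1) ? pos % 10 : -1;
}
```

With this model, bcdic_pos('8') - bcdic_pos('2') is 6 and bcdic_pos('8') - bcdic_pos('0') is -2, which is why plain subtraction of the zero character would not work and something like the modulo step is needed.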
-
- [2] On the IBM 7030 or Stretch computer, the digits 0 to 9 occurred in
- order but in consecutive *even* positions in the character set. That
- is, '8'-'2' would be 12, and '8'-'0' would be 16. The odd positions
- were used for subscripts: '8'+1 was a subscript 8.
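
Again as a sketch only: the layout described can be modelled in C as below. The base position STRETCH_ZERO is a placeholder chosen for the example, not the real 7030 code for '0', and all the function names are made up; only the even/odd interleaving comes from the description above.

```c
/* Assumed position of the digit 0.  Illustrative only: this is NOT
   the real IBM 7030 code, and the differences computed below do not
   depend on its value. */
enum { STRETCH_ZERO = 0 };

/* Position of the ordinary digit d (0..9): consecutive even slots. */
static int stretch_digit_pos(int d)
{
    return STRETCH_ZERO + 2 * d;
}

/* Position of the subscript digit d: the odd slot right after it. */
static int stretch_subscript_pos(int d)
{
    return stretch_digit_pos(d) + 1;
}

/* Recover the digit value from a position; returns -1 if the
   position is not one of the ten ordinary-digit slots. */
static int stretch_digit_value(int pos)
{
    int off = pos - STRETCH_ZERO;
    if (off < 0 || off > 18 || off % 2 != 0)
        return -1;
    return off / 2;
}
```

Here stretch_digit_pos(8) - stretch_digit_pos(2) is 12 and stretch_digit_pos(8) - stretch_digit_pos(0) is 16, so the difference has to be halved to get a digit value.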
-
- [3] In the days of real Teletypes and paper tape, there were 5-bit
- character sets. I have a table of one of these, CCITT #2. Its
- sequence of characters looks completely random; the digits are
- scattered all over the place, though always in the same shift
- state. I presume the code was based somehow on the internal
- arrangements of a particular, common model of Teletype. For
- example, '8'-'2' would have been -13 and '8'-'0' would be -1.
-
- The source for this information is a book specifically about the
- history of computer character sets, which I once had out of the
- library; I don't seem to have kept a record of its title. I imagine
- that CCITT #2, at least, is still in use in a few odd places.
- --
- Mark Brader, msb@sq.com "But I do't have a '' key o my termial."
- SoftQuad Inc., Toronto -- Lynn Gold
-
- My text in this article is in the public domain.
-